Streaming Min-max Hypergraph Partitioning
Authors
Abstract
In many applications, the data has a rich structure that can be represented by a hypergraph, in which vertices represent the data items and hyperedges represent the associations among items. Equivalently, we are given an input bipartite graph with two types of vertices: items and associations (which we refer to as topics). We consider the problem of partitioning the set of items into a given number of components such that the maximum number of topics covered by a component is minimized. This is a clustering problem with various applications, e.g., partitioning a set of information objects such as documents, images, and videos, and load balancing in modern computation platforms. In this paper, we focus on the streaming computation model for this problem, in which items arrive online one at a time and each item must be assigned irrevocably to a component at its arrival time. Motivated by scalability requirements, we focus on the class of streaming computation algorithms whose memory is at most linear in the number of components. We show that a greedy assignment strategy is able to recover a hidden co-clustering of items under a natural set of recovery conditions. We also report the results of an extensive empirical evaluation, which demonstrate that this greedy strategy yields superior performance when compared with alternative approaches.
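The streaming setting described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the greedy rule, so the rule used here is an assumption: each arriving item goes to the component where it adds the fewest new topics, with ties broken in favor of the component that currently covers fewer topics. The names `greedy_stream_partition` and `max_topics_covered` are hypothetical, and the sketch stores each component's full topic set, so it is not meant to reflect the memory bound discussed in the abstract.

```python
from typing import Hashable, Iterable, List, Set, Tuple


def greedy_stream_partition(
    items: Iterable[Iterable[Hashable]], k: int
) -> Tuple[List[int], List[Set[Hashable]]]:
    """Assign each arriving item (given as its topics) to one of k components.

    Assumed greedy rule (not taken from the abstract): pick the component
    where the item adds the fewest new topics; break ties by the smaller
    current topic count, then by the lower component index.
    """
    covered: List[Set[Hashable]] = [set() for _ in range(k)]
    assignment: List[int] = []
    for item in items:
        topics = set(item)
        best = min(
            range(k),
            key=lambda c: (len(topics - covered[c]), len(covered[c]), c),
        )
        covered[best] |= topics   # the assignment is irrevocable
        assignment.append(best)
    return assignment, covered


def max_topics_covered(covered: List[Set[Hashable]]) -> int:
    """Min-max objective: the largest number of topics covered by a component."""
    return max(len(s) for s in covered)


# Toy stream with two hidden clusters of items: {a, b} and {c, d}.
stream = [{"a", "b"}, {"c", "d"}, {"a", "b"}, {"c", "d"}]
assignment, covered = greedy_stream_partition(stream, k=2)
print(assignment, max_topics_covered(covered))
```

On this toy stream the items that share topics end up in the same component, which is the kind of co-clustering recovery the abstract refers to, although the sketch makes no claim about the paper's recovery conditions.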
Similar resources
Approximation Algorithms for Independent Set Problems on Hypergraphs
This thesis deals with approximation algorithms for the Maximum Independent Set and the Minimum Hitting Set problems on hypergraphs. As a hypergraph is a generalization of a graph, the question is whether the best known approximations on graphs can be extended to hypergraphs. We consider greedy, local search and partitioning algorithms. We introduce a general technique, called shrinkage reducti...
High Quality Hypergraph Partitioning via Max-Flow-Min-Cut Computations
In this thesis, we introduce a framework based on Max-Flow-Min-Cut computations for improving balanced k-way partitions of hypergraphs. Currently, variations of the FM heuristic [17] are used as local search algorithms in all state-of-the-art multilevel hypergraph partitioners. Such move-based heuristics have the disadvantage that they only incorporate local information about the problem struct...
Network Flow-Based Refinement for Multilevel Hypergraph Partitioning
We present a refinement framework for multilevel hypergraph partitioning that uses max-flow computations on pairs of blocks to improve the solution quality of a k-way partition. The framework generalizes the flow-based improvement algorithm of KaFFPa from graphs to hypergraphs and is integrated into the hypergraph partitioner KaHyPar. By reducing the size of hypergraph flow networks, improving ...
On solving Mincut Balanced Circuit Partitioning Problem for Digital Circuit Layout using Evolutionary Approach with Solution Archive
Finding an optimal partition has been an active topic in VLSI design in recent years. The Circuit Partitioning Problem is one of the most studied NP-complete problems, notable for its broad spectrum of applicability in digital circuit layout. The balance constraint is an important constraint that yields an area-balanced layout without compromising the mincut objective. This paper ...
Continuous bottleneck tree partitioning problems
We study continuous partitioning problems on tree network spaces whose edges and nodes are points in Euclidean spaces. A continuous partition of this space into p connected components is a collection of p subtrees, such that no pair of them intersect at more than one point, and their union is the tree space. An edge-partition is a continuous partition defined by selecting p − 1 cut points along ...
Publication date: 2015